ABSTRACT

We consider the minimization of a function which is the maximum of a
finite number of smooth but nonlinear functions. It is well known that
a minimax problem of this type connects naturally to a nonlinear program.
Through this connection the effective quasi-Newton method becomes
applicable. We show that this approach is valid and that the resulting
method has global convergence properties.

SIGNIFICANCE AND EXPLANATION

Minimax  is  an  important  principle  in  optimal  selection  of  parameters 
in  the  processing  of  empirical  data  and  abounds  with  applications  in 
Economics,  Statistics,  Engineering  and  many  other  areas.  The  type  of 
minimax  problem  considered  here  is  to  minimize  a function  which  is  the 
maximum  of  a finite  number  of  smooth  but  nonlinear  functions.  Because 
the  maximum  function  is  usually  not  smooth,  most  optimization  techniques 
are  not  suitable  for  handling  it.  However,  by  transforming  the  problem 
into an equivalent nonlinear program the efficient quasi-Newton method
becomes  applicable.  This  work  shows  that  this  approach  is  valid  and 
effective. 


The  responsibility  for  the  wording  and  views  expressed  in  this  descriptive 
summary lies with MRC, and not with the author of this report.


ON  THE  VALIDITY  OF  A NONLINEAR  PROGRAMMING 
METHOD  FOR  SOLVING  MINIMAX  PROBLEMS 


Shih-Ping  Han 


I.  Introduction 

We are dealing with the following minimax problem

(1.1)    $\min_{x \in R^n} \phi(x)$

where $\phi(x) = \max_{i=1,\dots,m} \{f_i(x)\}$ and the $f_i$'s are continuously
differentiable real-valued functions defined on $R^n$. The function $\phi$ is
usually not differentiable at a solution point; therefore, most unconstrained
optimization methods are no longer appropriate for handling it. However,
Problem (1.1) can be put into the following equivalent nonlinear programming form


(1.2)    $\min_{(x,\eta) \in R^{n+1}} \eta$   s.t.   $f_i(x) \le \eta$,   $i = 1,\dots,m$.


Hence, for solving the minimax problem (1.1) it seems feasible to use an effective
nonlinear programming method to solve (1.2). One purpose of this paper is to demonstrate
how the successful quasi-Newton method described in [3, 4, 8] can be so used. The
special structure of Problem (1.2), namely that it is linear in the variable $\eta$,
should be exploited. With this being done, however, the global convergence theorems
in [3] no longer apply. We show in this paper that the resulting method is still a
valid one and that global convergence is still achievable.

We describe the method in Section 2. Section 3 is devoted to the justification
of our approach. The global convergence theorem for the method is given in Section 4.

2. The Method


We are content with finding a stationary point of Problem (1.1), by which is
meant a point, $x$ say, in $R^n$ that satisfies the condition

(2.1)    $\min \{\phi'(x;d) : \|d\|_2 = 1\} \ge 0$ ,

where $\phi'(x;d)$ is the directional derivative of $\phi$ at $x$ in the direction $d$.
Clearly, Condition (2.1) is a necessary condition for the point $x$ to be a solution
of (1.1), and it reduces to the condition $\nabla\phi(x) = 0$ when $m = 1$. Let
$I(x) = \{i : f_i(x) = \phi(x)\}$ and let $f_i$ be called active at $x$ if $i \in I(x)$.
It is known [2, for instance] that $x$ is stationary if and only if there exists an
m-vector $v$ such that

(2.2)    (a)  $\sum_{i=1}^{m} v_i \nabla f_i(x) = 0$ ;
         (b)  $\sum_{i=1}^{m} v_i = 1$ ;
         (c)  $v \ge 0$ ;
         (d)  $v_i = 0$ if $i \notin I(x)$ .



Notice  that  Condition  (2.2)  is  just  the  Karush-Kuhn-Tucker  condition  of  the  nonlinear 
programming  problem  (1.2). 
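
The correspondence can be checked directly. The following is a short derivation
sketch added for illustration, not taken from the report; the Lagrangian symbol
$L$ is introduced here only for this purpose, and $\eta$ denotes the scalar
variable of (1.2).

```latex
% Lagrangian of (1.2), with multipliers v_i >= 0
L(x,\eta,v) = \eta + \sum_{i=1}^{m} v_i \bigl(f_i(x) - \eta\bigr)

% Stationarity in x gives (2.2.a); stationarity in \eta gives (2.2.b)
\nabla_x L = \sum_{i=1}^{m} v_i \nabla f_i(x) = 0, \qquad
\frac{\partial L}{\partial \eta} = 1 - \sum_{i=1}^{m} v_i = 0

% Complementary slackness together with feasibility f_i(x) <= \eta
v_i \bigl(f_i(x) - \eta\bigr) = 0, \qquad v_i \ge 0, \qquad i = 1,\dots,m
```

Since the $v_i$ sum to one, at least one of them is positive, so the corresponding
constraint is active and feasibility gives $\eta = \phi(x)$. Complementary
slackness then reads $v_i = 0$ whenever $f_i(x) < \phi(x)$, which is (2.2.d).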

The proposed method is an iterative process. At the k-th iteration we have an
estimate $x_k$ of a solution and also a scalar $\eta_k$ which is a predicted optimal
value of the objective function $\phi$. To construct new estimates $x_{k+1}$ and
$\eta_{k+1}$ we solve the following quadratic program

(2.3)    $\min_{(d,\delta) \in R^{n+1}} \frac{1}{2} d^T B_k d + \delta$
         s.t.   $f_i(x_k) + \nabla f_i(x_k)^T d \le \eta_k + \delta$,   $i = 1,\dots,m$.

Here, $B_k$ is a positive definite $n \times n$ matrix, preferably a good approximation
to the Hessian $\sum_{i=1}^{m} v_i \nabla^2 f_i(x)$ of the Lagrangian of (1.2), and
updated by Powell's scheme [9].




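Let $(d_k, \delta_k)$ be a solution of (2.3). Then we set
$(x_{k+1}, \eta_{k+1}) = (x_k, \eta_k) + \lambda_k (d_k, \delta_k)$,
where $\lambda_k$ is the stepsize determined by doing an exact line-search in the
direction $(d_k, \delta_k)$ on the function $\theta$ defined by

(2.4)    $\theta(x,\eta) = \eta + \sum_{i=1}^{m} \max\{f_i(x) - \eta, 0\}$ ;

that is,

         $\theta(x_{k+1}, \eta_{k+1}) = \min_{\lambda} \theta(x_k + \lambda d_k, \eta_k + \lambda \delta_k)$ .
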

It is merely for simplicity that we consider the exact line-search here. An analysis
of some inexact line-search is possible and will be very similar to the one given in
[5], where the determination of the stepsize $\lambda_k$ is based on the objective
function $\phi$ instead of $\theta$. There are some advantages in using the function
$\theta$ because it takes into consideration some inactive functions $f_i$, while the
function $\phi$ gives bias completely to the active ones.
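
The roles of $\phi$ and $\theta$ in the line search can be illustrated with a
minimal Python sketch. Everything in it is an assumption made for illustration
and not part of the report: the restriction to a scalar x, the search interval
[0, 1], the function names, and the golden-section routine standing in for the
exact line search (it is only reliable when $\theta$ is unimodal along the
search direction, as it is for convex $f_i$).

```python
import math
from typing import Callable, Sequence

Func = Callable[[float], float]  # assumption: scalar x (n = 1) for brevity

GOLDEN = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618


def phi(fs: Sequence[Func], x: float) -> float:
    """Minimax objective: the largest of the f_i at x."""
    return max(f(x) for f in fs)


def theta(fs: Sequence[Func], x: float, eta: float) -> float:
    """Line-search function: eta plus the total amount by which the f_i exceed eta."""
    return eta + sum(max(f(x) - eta, 0.0) for f in fs)


def exact_step(fs: Sequence[Func], x: float, eta: float,
               d: float, delta: float, tol: float = 1e-10) -> float:
    """Golden-section search on [0, 1] (assumed interval) for the step that
    minimizes theta along the direction (d, delta)."""
    def along(lam: float) -> float:
        return theta(fs, x + lam * d, eta + lam * delta)

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        a = hi - GOLDEN * (hi - lo)
        b = lo + GOLDEN * (hi - lo)
        if along(a) <= along(b):
            hi = b  # a minimizer lies in [lo, b]
        else:
            lo = a  # a minimizer lies in [a, hi]
    return 0.5 * (lo + hi)
```

For any $\eta$ one has $\theta(x,\eta) \ge \phi(x)$, with equality at
$\eta = \phi(x)$, so decreasing $\theta$ over $(x,\eta)$ also pushes $\phi$ down
while still registering functions that are close to, but not yet, active.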

The method described above is essentially an application of the method in [3] to
Problem (1.2) with its special structure being taken into account. The problem
considered there is the general nonlinear programming problem

(2.4)    $\min g(x)$   s.t.   $f_i(x) \le 0$,   $i = 1,\dots,m$,



and the subproblem to be solved in each iteration is the quadratic program

(2.5)    $\min_{d \in R^n} \nabla g(x_k)^T d + \frac{1}{2} d^T A_k d$
         s.t.   $f_i(x_k) + \nabla f_i(x_k)^T d \le 0$,   $i = 1,\dots,m$.

The stepsizes are determined by the exact penalty function

         $p(x,\alpha) = g(x) + \alpha \sum_{i=1}^{m} \max\{f_i(x), 0\}$ .

When $\beta x^T x \ge x^T A_k x \ge \gamma x^T x$ for some positive numbers $\beta$
and $\gamma$ and for each $k$ and $x$, and when the Lagrange multipliers of (2.5) are
uniformly bounded by $\alpha$ in the $\infty$-norm, then it is shown in [3] that any
accumulation point of the generated points $\{x_k\}$ is a Kuhn-Tucker point of (2.4).
Because Problem (1.2) is linear in $\eta$, the quadratic program (2.3) should preserve
this property. Though we usually require the matrix $B_k$ to be positive definite,
Problem (2.3) is no longer a positive definite quadratic program in the space
$R^{n+1}$. This makes the global convergence theorems in [3] not applicable here.
Some justification of our approach becomes necessary and will be given in the
following sections.
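
To make the loss of positive definiteness concrete, here is a small illustration
(the block-matrix symbol $H_k$ is introduced here and does not appear in the
original). The Hessian of the objective of (2.3) with respect to the full
variable $(d,\delta)$ is

```latex
H_k = \begin{pmatrix} B_k & 0 \\ 0 & 0 \end{pmatrix} \in R^{(n+1)\times(n+1)},
\qquad
\begin{pmatrix} 0 & 1 \end{pmatrix} H_k \begin{pmatrix} 0 \\ 1 \end{pmatrix} = 0 ,
```

so $H_k$ is only positive semidefinite: the objective is linear, not strictly
convex, along the $\delta$ direction.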



3.  Validity  of  the  method 


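For a given point $(x,\eta)$ in $R^{n+1}$ and a positive definite $n \times n$ matrix
$B$, the quadratic program (2.3) takes the form

(3.1)    $\min_{(d,\delta) \in R^{n+1}} \frac{1}{2} d^T B d + \delta$
         s.t.   $f_i(x) + \nabla f_i(x)^T d \le \eta + \delta$,   $i = 1,\dots,m$.

A point $(d,\delta)$ is a Karush-Kuhn-Tucker point of (3.1) if there exists an
m-vector $v$ such that

(3.2)    (a)  $B d + \sum_{i=1}^{m} v_i \nabla f_i(x) = 0$ ;
         (b)  $\sum_{i=1}^{m} v_i = 1$ ;
         (c)  $f_i(x) + \nabla f_i(x)^T d \le \eta + \delta$,   $i = 1,\dots,m$ ;
         (d)  $v \ge 0$ ;
         (e)  $v_i = 0$ if $f_i(x) + \nabla f_i(x)^T d < \eta + \delta$ .
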

We  first  show  that  the  method  is  well-defined  in  the  sense  that  a search  direction 
can  always  be  uniquely  determined  in  each  iteration. 

Theorem 3.1. Let $(x,\eta)$ be any point in $R^{n+1}$ and let $B$ be a positive
definite $n \times n$ matrix. Then there exists a unique solution $(d,\delta)$ of the
quadratic program (3.1). Furthermore, $x$ is a stationary point of (1.1) if and only
if $d = 0$ and $\delta = \phi(x) - \eta$.
Proof.  To  show  the  existence  of  a solution  for  (3.1)  we  first  note  that  its  feasible 
region  is  obviously  non-empty.  Because  of  the  convexity  we  only  need  to  show  that  the 
objective  function  is  bounded  below  in  the  feasible  region.  This  can  be  done  by 
considering  the  dual  problem 


         $\max_{v \in R^m} f(x)^T v - \frac{1}{2} v^T \nabla f(x) B^{-1} \nabla f(x)^T v$
         s.t.   $\sum_{i=1}^{m} v_i = 1$,   $v \ge 0$.

Clearly, the dual problem is also feasible. Therefore, it follows from the duality
theorem [7] that the optimal value of (3.1) is finite and, hence, a solution exists.

Let $(\bar d, \bar\delta)$ be a solution of (3.1). We want to show that
$(\bar d, \bar\delta)$ is the only one. From a result in [6, Corollary 3.6] the
solution $(\bar d, \bar\delta)$ is unique if $d^T B d > 0$ for any non-zero vector
$(d,\delta)$ in $R^{n+1}$ satisfying

(3.3)    (a)  $\bar d^T B d + \delta \le 0$ ,
         (b)  $\nabla f_i(x)^T d \le \delta$ for each $i \in J$ ,

where $J = \{j : f_j(x) + \nabla f_j(x)^T \bar d = \eta + \bar\delta\}$. Because $B$
is positive definite we only need to show that if $(d,\delta)$ satisfies (3.3) and
$d = 0$ then $\delta$ must also be zero. Note that it follows from (3.2.b) and (3.2.e)
that the index set $J$ cannot be empty. Therefore, if $d = 0$ and if $(d,\delta)$
satisfies (3.3.b) for some $i$ in $J$ then $\delta \ge 0$. But from (3.3.a) and
$d = 0$ we also have $\delta \le 0$. Hence, $\delta = 0$ and the uniqueness of the
solution is proven.

The  second  part  of  the  theorem  follows  straightforwardly  from  (3.2)  and  the 
uniqueness  of  the  solution.  Q.E.D. 

It may be worthwhile noticing that the solution vector $d$ is independent of the
given value $\eta$. Therefore, a bad estimate in $\eta$ should usually not spoil
a good estimate in $x$.
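
This independence is easy to see numerically in a small case. The sketch below
is an illustration under stated assumptions, not the report's implementation:
it takes m = 2 and B = I, so that the dual problem from the proof of Theorem 3.1,
with v = (t, 1 - t), reduces to a one-dimensional concave quadratic on [0, 1].
The function name `qp_direction_two` and the test data are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(p * q for p, q in zip(a, b))


def qp_direction_two(f: Sequence[float], g: Sequence[Vec],
                     eta: float) -> Tuple[Vec, float, float]:
    """Solve the quadratic program (3.1) for m = 2 and B = I (assumptions).

    f holds f_1(x), f_2(x); g holds the gradients of f_1, f_2 at x.
    Returns (d, delta, t), where the multipliers are v = (t, 1 - t).
    """
    a = [p - q for p, q in zip(g[0], g[1])]  # gradient difference g_1 - g_2
    aa = dot(a, a)
    if aa > 0.0:
        # Unconstrained maximizer of the concave dual, clipped to [0, 1].
        t = (f[0] - f[1] - dot(a, g[1])) / aa
        t = min(1.0, max(0.0, t))
    else:
        # Equal gradients: the dual is linear in t and d does not depend on t.
        t = 1.0 if f[0] >= f[1] else 0.0
    d = [-(q + t * ai) for q, ai in zip(g[1], a)]  # d = -(t*g_1 + (1-t)*g_2)
    # Smallest delta that keeps both linearized constraints feasible.
    delta = max(fi + dot(gi, d) for fi, gi in zip(f, g)) - eta
    return d, delta, t


# Example: f_1(x) = x^2 and f_2(x) = (x - 2)^2 evaluated at x = 0 (n = 1).
f_at_0 = [0.0, 4.0]
g_at_0 = [[0.0], [-4.0]]
d_a, delta_a, _ = qp_direction_two(f_at_0, g_at_0, eta=0.0)
d_b, delta_b, _ = qp_direction_two(f_at_0, g_at_0, eta=10.0)
# d_a and d_b coincide; only delta shifts with eta.
```

In this example both calls give the same d, while $\delta$ moves by exactly the
change in $\eta$. At the stationary point x = 1 of the same pair of functions the
routine returns d = 0 and $\delta = \phi(x) - \eta$, in line with Theorem 3.1.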

The search direction $(d,\delta)$ is not only well-defined but also useful because it
is a descent direction for our optimality function $\theta$. Before giving this result
we note here that the symbol $\theta'(x,\eta;d,\delta)$ denotes the directional
derivative of $\theta$ at $(x,\eta)$ in the direction $(d,\delta)$.

Theorem 3.2. Let $(x,\eta)$ be a given point in $R^{n+1}$ and $B$ be a positive
definite and symmetric $n \times n$ matrix. Let $(d,\delta)$ be the unique solution
of (3.1). Then

         $\theta'(x,\eta;d,\delta) \le -d^T B d$ .


Proof. Let $I_+ = \{i : f_i(x) > \eta\}$, $I_0 = \{i : f_i(x) = \eta\}$, and
$I_- = \{i : f_i(x) < \eta\}$. Then from a result of Danskin [1] we have

         $\theta'(x,\eta;d,\delta) = \delta + \sum_{i \in I_+} (\nabla f_i(x)^T d - \delta) + \sum_{i \in I_0} \max\{\nabla f_i(x)^T d - \delta, 0\}$ .

Because $(d,\delta)$ is also a Karush-Kuhn-Tucker point of (3.1) there exists an
m-vector $v$ such that (3.2) holds. It follows from (3.2.c) that if $i \in I_0$ then
$\max\{\nabla f_i(x)^T d - \delta, 0\} = 0$. It also follows from (3.2) that

         $\theta'(x,\eta;d,\delta) = -d^T B d - \sum_{i=1}^{m} v_i \nabla f_i(x)^T d + \delta + \sum_{i \in I_+} (\nabla f_i(x)^T d - \delta)$

         $\le -d^T B d - \sum_{i=1}^{m} v_i (\delta + \eta - f_i(x)) + \delta + \sum_{i \in I_+} (\eta - f_i(x))$

         $\le -d^T B d + \sum_{i \in I_+} (1 - v_i)(\eta - f_i(x)) \le -d^T B d$ .    Q.E.D.
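
As a sanity check of the bound, consider the special case $m = 1$ (a worked
example added here for illustration, not part of the original proof). The
Karush-Kuhn-Tucker conditions of (3.1) then give $v_1 = 1$, $B d + \nabla f_1(x) = 0$,
and an active constraint, so that $\nabla f_1(x)^T d = -d^T B d$ and
$\delta = f_1(x) - \eta - d^T B d$. The three possible cases are:

```latex
% Case f_1(x) > \eta  (1 \in I_+):
\theta'(x,\eta;d,\delta) = \delta + \bigl(\nabla f_1(x)^T d - \delta\bigr)
                         = \nabla f_1(x)^T d = -d^T B d

% Case f_1(x) = \eta  (1 \in I_0), where \delta = \nabla f_1(x)^T d:
\theta'(x,\eta;d,\delta) = \delta + \max\{\nabla f_1(x)^T d - \delta,\, 0\}
                         = \delta = -d^T B d

% Case f_1(x) < \eta  (1 \in I_-):
\theta'(x,\eta;d,\delta) = \delta = \bigl(f_1(x) - \eta\bigr) - d^T B d < -d^T B d
```

The bound of Theorem 3.2 therefore holds with equality in the first two cases and
strictly in the third.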

The following corollary justifies the use of the function $\theta$ as an optimality
function for solving the minimax problem (1.1).

Corollary 3.3. If $(x,\eta)$ is a local minimum point of the function $\theta$ then
$x$ is a stationary point of Problem (1.1).

Proof. Consider the quadratic program (3.1) with $B$ being any positive definite and
symmetric matrix. Let $(d,\delta)$ be its solution. The vector $d$ must be zero;
otherwise, $\theta'(x,\eta;d,\delta) < 0$, which contradicts that $(x,\eta)$ is a local
minimum. Hence, the result follows immediately from Theorem 3.1. Q.E.D.

We  also  observe  that  both  Problem  (1.2)  and  (3.1)  satisfy  the  Arrow-Hurwicz-Uzawa 
constraint  qualification  [see  7,  for  instance].  From  a result  in  [10;  Theorems  1 and  3] 
the  feasible  regions  of  (1.2)  and  (3.1)  are  stable  when  they  are  subjected  to  small 
perturbations. This property is very desirable and makes our approach very useful
in practice.


4. Global convergence


We study in this section the global convergence properties of the method. Here
we need a very useful result of Robinson [11] on the stability of quadratic programs.
Actually, Robinson considers the stability of a very general class of problems, which
he calls generalized equations. The following lemma is a straightforward consequence
of his Theorem 2.

Lemma 4.1. Let $(x,\eta)$ be a point in $R^{n+1}$, let $B$ be a positive definite
$n \times n$ matrix, and let $(\bar d, \bar\delta)$ be the unique solution of (3.1).
Then there exist constants $\lambda$ and $\epsilon$ such that for each $n \times n$
matrix $B'$ and each $(x',\eta')$ in $R^{n+1}$ with

         $\epsilon' := \max\{\|B' - B\|_2, \|(x',\eta') - (x,\eta)\|_2\} < \epsilon$

the quadratic program

         $\min_{(d,\delta) \in R^{n+1}} \frac{1}{2} d^T B' d + \delta$
         s.t.   $f_i(x') + \nabla f_i(x')^T d \le \eta' + \delta$,   $i = 1,\dots,m$

has a unique solution $(d',\delta')$ and

         $\|(\bar d, \bar\delta) - (d',\delta')\|_2 \le \lambda \epsilon' (1 - \lambda \epsilon')^{-1} (1 + \|(\bar d, \bar\delta)\|_2)$ .

We  now  give  the  global  convergence  theorem  below. 

Theorem 4.2. Let $\{B_k\}$ be a sequence of $n \times n$ symmetric matrices such that
for some positive numbers $\alpha$ and $\beta$ and for each $k$

         $\beta x^T x \le x^T B_k x \le \alpha x^T x$   for any $x$ in $R^n$.

Let $\{(x_k, \eta_k)\}$ be a sequence of points in $R^{n+1}$ generated by the method
from any given starting point $(x_0, \eta_0)$ and let $(\bar x, \bar\eta)$ be any
accumulation point of this sequence. Then $\bar x$ is a stationary point of the
minimax problem (1.1).

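Proof. Without loss of generality we may assume that
$(x_k, \eta_k) \to (\bar x, \bar\eta)$. By passing to a subsequence, if necessary, we
have a positive definite and symmetric matrix $\bar B$ such that $B_k \to \bar B$.
Consider the quadratic program

         $\min_{(d,\delta) \in R^{n+1}} \frac{1}{2} d^T \bar B d + \delta$
         s.t.   $f_i(\bar x) + \nabla f_i(\bar x)^T d \le \bar\eta + \delta$,   $i = 1,\dots,m$.

By Theorem 3.1 the quadratic program has a unique solution, $(\bar d, \bar\delta)$
say. If $\bar d = 0$ then by Theorem 3.1 again $\bar x$ is a stationary point of (1.1)
and the proof is done. If $\bar d \ne 0$ we will deduce a contradiction.

Define a point $(\tilde x, \tilde\eta)$ by
$(\tilde x, \tilde\eta) = (\bar x, \bar\eta) + \tilde\lambda (\bar d, \bar\delta)$,
where $0 \le \tilde\lambda \le 1$ is chosen so that

         $\theta(\tilde x, \tilde\eta) = \min_{0 \le \lambda \le 1} \theta(\bar x + \lambda \bar d, \bar\eta + \lambda \bar\delta)$ .

Because of Theorem 3.2 and $\bar d \ne 0$, the number
$\gamma := \theta(\bar x, \bar\eta) - \theta(\tilde x, \tilde\eta)$ is positive. By
Lemma 4.1 we also have $(d_k, \delta_k) \to (\bar d, \bar\delta)$; here again
$\{(d_k, \delta_k)\}$ may be only a subsequence. Therefore, there exists an
arbitrarily large $k$ such that

         $\theta(x_{k+1}, \eta_{k+1}) = \theta(x_k + \lambda_k d_k, \eta_k + \lambda_k \delta_k)$

         $\le \theta(x_k + \tilde\lambda d_k, \eta_k + \tilde\lambda \delta_k)$

         $\le \theta(\tilde x, \tilde\eta) + \frac{1}{2}\gamma$

         $< \theta(\bar x, \bar\eta)$ .

This contradicts the fact that the sequence $\{\theta(x_k, \eta_k)\}$ is monotonically
decreasing and hence $\theta(x_{k+1}, \eta_{k+1}) \ge \theta(\bar x, \bar\eta)$. Q.E.D.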

The sequence of points can be shown to converge to a solution point when we assume
that the functions $f_i$ are convex. But, from Theorem 4.2, our method should be
expected to work well even when the functions are not convex.


Minimax, Nonlinear programming, Quasi-Newton method